Siaya County
Uchaguzi-2022: A Dataset of Citizen Reports on the 2022 Kenyan Election
Mondini, Roberto, Kotonya, Neema, Logan, Robert L. IV, Olson, Elizabeth M, Lungati, Angela Oduor, Odongo, Daniel Duke, Ombasa, Tim, Lamba, Hemank, Cahill, Aoife, Tetreault, Joel R., Jaimes, Alejandro
Online reporting platforms have enabled citizens around the world to collectively share their opinions and report in real time on events impacting their local communities. Systematically organizing (e.g., categorizing by attributes) and geotagging large amounts of crowdsourced information is crucial to ensuring that accurate and meaningful insights can be drawn from this data and used by policy makers to bring about positive change. These tasks, however, typically require extensive manual annotation efforts. In this paper we present Uchaguzi-2022, a dataset of 14k categorized and geotagged citizen reports related to the 2022 Kenyan General Election containing mentions of election-related issues such as official misconduct, vote count irregularities, and acts of violence. We use this dataset to investigate whether language models can assist in scalably categorizing and geotagging reports, thus highlighting its potential application in the AI for Social Good space.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- Africa > Kenya > Bomet County > Bomet (0.05)
- (34 more...)
Construction and application of artificial intelligence crowdsourcing map based on multi-track GPS data
Wang, Yong, Zhou, Yanlin, Ji, Huan, He, Zheng, Shen, Xinyu
In recent years, the rapid development of high-precision map technology combined with artificial intelligence has ushered in a new development opportunity in the field of intelligent vehicles. High-precision map technology is an important guarantee for intelligent vehicles to achieve autonomous driving. However, due to the lack of research on high-precision map technology, it is difficult to rationally use this technology in the field of intelligent vehicles. Therefore, relevant researchers studied a fast and effective algorithm to generate high-precision GPS data from a large number of low-precision GPS trajectory data fusion, and generated several key data points to simplify the description of GPS trajectory, and realized the "crowdsourced update" model based on a large number of social vehicles for map data collection came into being. This kind of algorithm has the important significance to improve the data accuracy, reduce the measurement cost and reduce the data storage space. On this basis, this paper analyzes the implementation form of crowdsourcing map, so as to improve the various information data in the high-precision map according to the actual situation, and promote the high-precision map can be reasonably applied to the intelligent car.
- North America > United States > New York (0.04)
- North America > United States > New Jersey > Middlesex County > Piscataway (0.04)
- North America > United States > California > Santa Clara County > San Jose (0.04)
- (5 more...)
- Transportation > Ground > Road (0.89)
- Information Technology (0.67)
BART-SIMP: a novel framework for flexible spatial covariate modeling and prediction using Bayesian additive regression trees
Jiang, Alex Ziyu, Wakefield, Jon
Prediction is a classic challenge in spatial statistics and the inclusion of spatial covariates can greatly improve predictive performance when incorporated into a model with latent spatial effects. It is desirable to develop flexible regression models that allow for nonlinearities and interactions in the covariate structure. Machine learning models have been suggested in the spatial context, allowing for spatial dependence in the residuals, but fail to provide reliable uncertainty estimates. In this paper, we investigate a novel combination of a Gaussian process spatial model and a Bayesian Additive Regression Tree (BART) model. The computational burden of the approach is reduced by combining Markov chain Monte Carlo (MCMC) with the Integrated Nested Laplace Approximation (INLA) technique. We study the performance of the method via simulations and use the model to predict anthropometric responses, collected via household cluster samples in Kenya.
- North America > United States (0.46)
- Africa > Kenya > Nairobi City County > Nairobi (0.04)
- Africa > Kenya > Mombasa County > Mombasa (0.04)
- (25 more...)
- Research Report > New Finding (0.46)
- Research Report > Experimental Study (0.46)
Two Shifts for Crop Mapping: Leveraging Aggregate Crop Statistics to Improve Satellite-based Maps in New Regions
Kluger, Dan M., Wang, Sherrie, Lobell, David B.
Crop type mapping at the field level is critical for a variety of applications in agricultural monitoring, and satellite imagery is becoming an increasingly abundant and useful raw input from which to create crop type maps. Still, in many regions crop type mapping with satellite data remains constrained by a scarcity of field-level crop labels for training supervised classification models. When training data is not available in one region, classifiers trained in similar regions can be transferred, but shifts in the distribution of crop types as well as transformations of the features between regions lead to reduced classification accuracy. We present a methodology that uses aggregate-level crop statistics to correct the classifier by accounting for these two types of shifts. To adjust for shifts in the crop type composition we present a scheme for properly reweighting the posterior probabilities of each class that are output by the classifier. To adjust for shifts in features we propose a method to estimate and remove linear shifts in the mean feature vector. We demonstrate that this methodology leads to substantial improvements in overall classification accuracy when using Linear Discriminant Analysis (LDA) to map crop types in Occitanie, France and in Western Province, Kenya. When using LDA as our base classifier, we found that in France our methodology led to percent reductions in misclassifications ranging from 2.8% to 42.2% (mean = 21.9%) over eleven different training departments, and in Kenya the percent reductions in misclassification were 6.6%, 28.4%, and 42.7% for three training regions. While our methodology was statistically motivated by the LDA classifier, it can be applied to any type of classifier. As an example, we demonstrate its successful application to improve a Random Forest classifier.
- Africa > Kenya > Western Province (0.34)
- Africa > Kenya > Siaya County > Siaya (0.05)
- North America > United States > California > Santa Clara County > Stanford (0.04)
- (5 more...)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.54)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Malaria infection and severe disease risks in Africa
Understanding how changes in community parasite prevalence alter the rate and age distribution of severe malaria is essential for optimizing control efforts. Paton et al. assessed the incidence of pediatric severe malaria admissions from 13 hospitals in East Africa from 2006 to 2020 (see the Perspective by Taylor and Slutsker). Each 25% increase in community parasite prevalence shifted hospital admissions toward younger children. Low rates of lifetime infections appeared to confer some immunity to severe malaria in very young children. Children under the age of 5 years thus need to remain a focus of disease prevention for malaria control. Science , abj0089, this issue p. [926][1]; see also abk3443, p. [855][2] The relationship between community prevalence of Plasmodium falciparum and the burden of severe, life-threatening disease remains poorly defined. To examine the three most common severe malaria phenotypes from catchment populations across East Africa, we assembled a dataset of 6506 hospital admissions for malaria in children aged 3 months to 9 years from 2006 to 2020. Admissions were paired with data from community parasite infection surveys. A Bayesian procedure was used to calibrate uncertainties in exposure (parasite prevalence) and outcomes (severe malaria phenotypes). Each 25% increase in prevalence conferred a doubling of severe malaria admission rates. Severe malaria remains a burden predominantly among young children (3 to 59 months) across a wide range of community prevalence typical of East Africa. This study offers a quantitative framework for linking malaria parasite prevalence and severe disease outcomes in children. [1]: /lookup/doi/10.1126/science.abj0089 [2]: /lookup/doi/10.1126/science.abk3443
- Africa > East Africa (0.66)
- Africa > Uganda > Western Region > Kabale District (0.04)
- Africa > Tanzania (0.04)
- (3 more...)
MasakhaNER: Named Entity Recognition for African Languages
Adelani, David Ifeoluwa, Abbott, Jade, Neubig, Graham, D'souza, Daniel, Kreutzer, Julia, Lignos, Constantine, Palen-Michel, Chester, Buzaaba, Happy, Rijhwani, Shruti, Ruder, Sebastian, Mayhew, Stephen, Azime, Israel Abebe, Muhammad, Shamsuddeen, Emezue, Chris Chinenye, Nakatumba-Nabende, Joyce, Ogayo, Perez, Aremu, Anuoluwapo, Gitau, Catherine, Mbaye, Derguene, Alabi, Jesujoba, Yimam, Seid Muhie, Gwadabe, Tajuddeen, Ezeani, Ignatius, Niyongabo, Rubungo Andre, Mukiibi, Jonathan, Otiende, Verrah, Orife, Iroro, David, Davis, Ngom, Samba, Adewumi, Tosin, Rayson, Paul, Adeyemi, Mofetoluwa, Muriuki, Gerald, Anebi, Emmanuel, Chukwuneke, Chiamaka, Odu, Nkiruka, Wairagala, Eric Peter, Oyerinde, Samuel, Siro, Clemencia, Bateesa, Tobius Saul, Oloyede, Temilola, Wambui, Yvonne, Akinode, Victor, Nabagereka, Deborah, Katusiime, Maurice, Awokoya, Ayodele, MBOUP, Mouhamadane, Gebreyohannes, Dibora, Tilaye, Henok, Nwaike, Kelechi, Wolde, Degaga, Faye, Abdoulaye, Sibanda, Blessing, Ahia, Orevaoghene, Dossou, Bonaventure F. P., Ogueji, Kelechi, DIOP, Thierno Ibrahima, Diallo, Abdoulaye, Akinfaderin, Adewale, Marengereke, Tendai, Osei, Salomey
We take a step towards addressing the under-representation of the African continent in NLP research by creating the first large publicly available high-quality dataset for named entity recognition (NER) in ten African languages, bringing together a variety of stakeholders. We detail characteristics of the languages to help researchers understand the challenges that these languages pose for NER. We analyze our datasets and conduct an extensive empirical evaluation of state-of-the-art methods across both supervised and transfer learning settings. We release the data, code, and models in order to inspire future research on African NLP.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Africa > Niger (0.05)
- (52 more...)